Abstract
In the traditional education system, teachers and professors play a crucial role in teaching and guiding students. When it comes to examinations, however, tasks such as preparing question papers and evaluating students’ answers are still carried out manually. Answer evaluation is one of the most demanding of these tasks, and manual correction can introduce errors. To bridge this gap, the proposed system uses artificial intelligence to automate the examination evaluation process, reducing teachers’ workload and allowing them to focus on student improvement activities. It is designed as an AI-based system that provides automated question paper generation, answer evaluation, and role-based access for students, teachers, and administrators, along with basic examination monitoring features. Question paper generation and answer evaluation rely on AI techniques together with natural language processing (NLP) and large language models. This approach reduces workload, saves time, and minimizes human error in the examination process.
Introduction
Examinations are central to assessing students’ knowledge and academic performance, but traditional manual methods of question preparation and answer evaluation are time-consuming, error-prone, and increasingly inefficient. To address these challenges, this paper proposes an AI-based automated examination framework that integrates question paper generation and descriptive answer evaluation using natural language processing (NLP). The system generates syllabus-aligned questions with controlled difficulty levels and evaluates student responses on semantic similarity rather than keyword matching, which yields fairer and more consistent scoring. Implemented as a prototype, the framework demonstrates improved evaluation efficiency and streamlined examination workflows, reducing teachers’ administrative workload while retaining human oversight where necessary.
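To make the semantic-similarity evaluation concrete, the sketch below scores a student’s descriptive answer against a teacher’s model answer using sentence embeddings and cosine similarity. This is a minimal illustration rather than the paper’s implementation: the sentence-transformers library, the all-MiniLM-L6-v2 model, and the linear mapping from similarity to marks are all assumptions.

```python
# Minimal sketch of semantic-similarity answer scoring, not the paper's exact
# implementation. Assumes the sentence-transformers package; the model name
# and the linear similarity-to-marks mapping are illustrative choices.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # compact general-purpose embedder

def score_answer(reference: str, response: str, max_marks: float = 10.0) -> float:
    """Score a descriptive answer by cosine similarity to a model answer."""
    ref_emb, resp_emb = model.encode([reference, response], convert_to_tensor=True)
    similarity = util.cos_sim(ref_emb, resp_emb).item()  # in [-1, 1]
    similarity = max(0.0, similarity)  # clamp: dissimilar answers earn zero
    return round(similarity * max_marks, 2)

reference = "Photosynthesis converts light energy into chemical energy stored in glucose."
student = "Plants use sunlight to make glucose, storing the light energy chemically."
print(score_answer(reference, student))  # high score despite different wording
```

Because the score comes from embedding similarity rather than keyword overlap, paraphrased but correct answers are rewarded; a threshold on the score could flag low-confidence cases for the human review the framework allows.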
Conclusion
This paper presented an AI-based automated examination system focused on question paper generation and answer evaluation. The proposed framework addresses the limitations of traditional manual examination processes by reducing workload and improving evaluation consistency. Using artificial intelligence and NLP techniques, the system generates questions aligned with syllabus requirements and evaluates descriptive answers on conceptual understanding rather than surface keyword overlap.
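As a rough illustration of the generation side, the sketch below prompts a large language model for syllabus-aligned questions at a chosen difficulty level. The paper does not name a provider, so the OpenAI client, the gpt-4o-mini model, and the prompt template are illustrative assumptions.

```python
# Hedged sketch of syllabus-aligned question generation via an LLM.
# The OpenAI client, model name, and prompt wording are illustrative
# assumptions; any instruction-following LLM could be substituted.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_questions(topic: str, difficulty: str, count: int = 5) -> str:
    """Ask an LLM for exam questions on a syllabus topic at a set difficulty."""
    prompt = (
        f"Generate {count} {difficulty}-level descriptive exam questions "
        f"on the syllabus topic: {topic}. Number each question."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(generate_questions("Normalization in relational databases", "moderate"))
```

Difficulty control here is purely prompt-based; a production system would additionally validate the generated questions against the syllabus and deduplicate them against past papers.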
The results show that the proposed system reduces evaluation time and provides more consistent scoring than manual assessment. Although human review remains necessary in exceptional cases, the automated framework supports teachers by handling routine examination tasks efficiently. Overall, the proposed approach demonstrates that AI-based examination systems can improve the efficiency, reliability, and fairness of academic assessment.